Refactor to Remove Lower and Upper Bounds from Models #459

Open · yalsaffar wants to merge 17 commits into main
Conversation

@yalsaffar (Contributor) commented Nov 19, 2024

Description:
This refactor addresses issue #379 where AEPsych models currently depend on lower and upper bounds (lb and ub) being passed into their constructors. The goal is to shift the responsibility of managing bounds to the strategy, allowing models to focus solely on the dimensionality of the space. This change reduces boilerplate and improves flexibility, as bounds are now passed as parameters when needed, rather than being stored within the model itself.

LookAhead:

  • Updated the GlobalLookaheadAcquisitionFunction class and its subclasses to accept lb and ub from OptimizeAcqfGenerator; previously they relied on model.lb and model.ub to compute Xq, even though this information was already available to OptimizeAcqfGenerator (sketched below).
  • Updated OptimizeAcqfGenerator to initialize the acquisition functions that require lb and ub with these parameters.
  • Updated MonotonicRejectionGenerator and other generators to accept lb and ub directly rather than relying on model.lb and model.ub.
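A minimal sketch of the new wiring, with simplified and partly hypothetical signatures (the real classes take more parameters than this):

```python
import torch

# Simplified sketch: the generator owns lb/ub and forwards them to acquisition
# functions that need a query set, instead of the acquisition function reading
# model.lb / model.ub. The random Xq construction here is illustrative only.
class GlobalLookaheadAcquisitionFunction:
    def __init__(self, model, lb: torch.Tensor, ub: torch.Tensor, query_set_size: int = 256):
        self.model = model
        # Xq is built from explicitly passed bounds, not model attributes.
        self.Xq = lb + (ub - lb) * torch.rand(query_set_size, lb.shape[0])


class OptimizeAcqfGenerator:
    def __init__(self, acqf_cls, lb: torch.Tensor, ub: torch.Tensor):
        # The generator already received the bounds, so it can hand them on.
        self.acqf_cls, self.lb, self.ub = acqf_cls, lb, ub

    def _instantiate_acquisition_fn(self, model):
        return self.acqf_cls(model, lb=self.lb, ub=self.ub)
```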

In models/base.py - Massive Refactor:

  • Moving Methods to Strategy:

    • Moved get_max, get_min, and inv_query methods to Strategy, as these methods rely on having lb and ub in the model.
    • Also moved get_jnd to Strategy, as it is not needed in the models but only in the strategy.
  • Dim Grid Handling:

    • dim_grid was moved to Strategy but kept in base.py temporarily, since PairwiseProbitModel has not yet been refactored away from lb and ub (this should be handled in a separate PR).
    • Conveniently, all of these methods rely on functions from aepsych.utils and can simply be called from Strategy itself.
  • Simplified AEPsychMixin:

    • The simplification of AEPsychMixin paves the way for the next PR.
  • Enhancements in Strategy:

    • All of the moved methods were added as helper functions on Strategy, called from the original entry points.
    • A _is_first_fit flag was also added to ensure the model is fully fit the first time around, rather than skipping that step and updating right away (see the sketch below).
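An illustrative sketch of the _is_first_fit guard (heavily simplified; the real Strategy tracks much more state):

```python
# The flag guarantees a full fit on the first call instead of jumping
# straight to an incremental update.
class Strategy:
    def __init__(self, model):
        self.model = model
        self._is_first_fit = True

    def fit(self, x, y):
        if self._is_first_fit:
            self.model.fit(x, y)      # full fit the first time around
            self._is_first_fit = False
        else:
            self.model.update(x, y)   # incremental update on later calls
```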

In Models:

  • Transition from lb and ub to inducing_points:

    • All models (except PairwiseProbitModel) that previously relied on lb and ub now work with inducing_points instead.
  • Conditions for inducing_points (see the sketch after this section):

    • If points are provided AND a SobolAllocator is provided:
      • The points are overwritten by the allocator’s output.
    • If points are not provided AND a SobolAllocator is provided:
      • The points are allocated by the Sobol allocator.
    • If points are not provided AND dim is provided:
      • A dummy set of points is created.
    • If neither points nor dim are provided (and no SobolAllocator is provided):
      • An error is raised, as there is not enough information to create the points.
    • If points are provided AND dim is None:
      • dim is inferred from the shape of the points tensor.
  • NOTE: SobolAllocator gets special treatment because it is the only allocator that requires no input; the others cannot be initialized unless data is passed at the beginning, which is uncommon.

  • NOTE2: in MonotonicProjectionGP, self.lb[dim] was replaced with self.inducing_points.min(dim=0).values.

  • Inducing Point Method:

    • The inducing point method is later set from user input, or defaults to AutoAllocator.
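A hedged sketch of the resolution rules listed above; the function name and the allocator call are illustrative assumptions, not the PR's exact code. The last two lines also show NOTE2's replacement in isolation:

```python
import torch

def resolve_inducing_points(points=None, dim=None, allocator=None, n=100):
    # Rules 1 and 2: a SobolAllocator wins, whether or not points were given,
    # because it can allocate without any training data.
    if allocator is not None:
        return allocator.allocate_inducing_points(num_inducing=n)  # assumed interface
    # Rule 5: points given; dim, if unset, is inferred as points.shape[-1].
    if points is not None:
        return points
    # Rule 3: no points, but a dim: create a dummy set to be replaced later.
    if dim is not None:
        return torch.zeros(n, dim)
    # Rule 4: nothing to work with.
    raise ValueError("Need inducing points, a dim, or an allocator.")

# NOTE2: per-dimension lower bounds recovered from the stored inducing points
# rather than a stored self.lb tensor.
points = torch.rand(16, 3)
lb = points.min(dim=0).values  # shape (3,); lb[d] stands in for the old self.lb[d]
```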

Tests:

Tests have been structured around the changes and refactors made in the codebase, focusing on initialization, inducing points, and configurations.

  1. Tests Related to Initialization and Inducing Points:

    • Tests like:
      • tests/acquisition/test_mi.py
      • tests/generators/test_epsilon_greedy_generator.py

    These tests verify initialization and the calculation of inducing points. The inducing points are computed as before (using SobolAllocator), but instead of being computed inside the model’s class, they are now computed prior to initialization and passed into the model from the generator (see the sketch after this list).

  2. Tests Related to Configurations and Dummy Inducing Points:

    • Tests like:
      • tests/models/test_gp_regression.py
      • tests/models/test_monotonic_projection_gp.py
      • tests/test_config.py
      • ...

    These tests focus on configurations, ensuring that the dim parameter is provided so that dummy inducing points can be created when no inducing points are explicitly given. In some cases the allocator (e.g., SobolAllocator) is used for initialization, but it is not wanted for the model’s entire lifecycle, only for the initial allocation of points.
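A sketch of the test pattern described in item 1, under stated assumptions: the bounds are placeholders, and the commented model line is hypothetical.

```python
import torch
from torch.quasirandom import SobolEngine

# Inducing points are drawn up front (Sobol, as before) and handed to the
# model, instead of being computed in its __init__.
lb, ub = torch.zeros(2), torch.ones(2)
raw = SobolEngine(dimension=2, scramble=True).draw(100)  # unit-cube draws
inducing_points = lb + (ub - lb) * raw  # scaled to the experiment bounds
# model = GPClassificationModel(inducing_points=inducing_points)  # hypothetical signature
```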

Note: One test case has been failing and I am not sure why:

aepsych\tests\test_strategy.py", line 151, in test_finish_criteria
    self.assertTrue(self.strat.finished)
AssertionError: False is not true

It originates from this condition inside strat.finished not being met:

fmean, _ = self.model.predict(self.eval_grid, probability_space=True)

Here fmean is approximately 0.1, while the threshold set beforehand is 0.3.

I would love to hear your thoughts on the root of this issue.

@JasonKChow (Contributor) commented:
Moving the methods (like get_min()) out of the models and into the strategy class is probably not the correct path. We are trying to better encapsulate what strategy does (that is, run an experiment), whereas these methods are often used for post-hoc analysis.

I think the get_min() methods should be separated out as utility functions (probably going into the utils.py file in models). You noticed that they already call functions from there, so they may as well live there standalone. These utility functions should accept the model as the first argument (following the utility functions they call) as well as bounds. The bounds should either be set explicitly or be inferred from the model's data (if available, with a warning).
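A sketch of the proposed standalone utility, with a hypothetical signature following the description above (the fallback inference and random-grid search are illustrative assumptions):

```python
import warnings
from typing import Optional, Tuple

import torch

def get_min(model, bounds: Optional[torch.Tensor] = None, n_samples: int = 1000) -> Tuple[torch.Tensor, torch.Tensor]:
    """Hypothetical standalone version: model first, explicit bounds second."""
    if bounds is None:
        # Fall back to bounds inferred from the model's training data,
        # with a warning, as suggested above.
        warnings.warn("No bounds given; inferring from the model's training data.")
        X = model.train_inputs[0]
        bounds = torch.stack([X.min(dim=0).values, X.max(dim=0).values])
    lb, ub = bounds[0], bounds[1]
    grid = lb + (ub - lb) * torch.rand(n_samples, lb.shape[0])
    fmean, _ = model.predict(grid)  # AEPsych models return (mean, variance)
    idx = fmean.argmin()
    return fmean[idx], grid[idx]
```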

--

I'm not convinced that inducing_points should be a separate argument for models. There are multiple reasons:

  • As a corollary to wanting to remove ub/lb from models, we also don't want dim there; right now dim seems to be used only for inducing points.
  • This introduces a complex set of rules between inducing_points, dim, and inducing_point_method.
  • The model call signatures are already pretty complex; limiting the bloat is good.

Given that all three arguments appear to exist strictly for inducing points, we should try to encapsulate it all in InducingPointAllocator (or at least reduce it to that one argument). I think the best way to do this is to require an inducing_point_method that is an InducingPointAllocator and have any configuration logic deal with it. This may require adding new InducingPointAllocator classes (one for fixed inducing points, and maybe one for dummy points). This will unify the API and push the complexities of setting up inducing points into a separate class with its own guard rails.
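A hypothetical sketch of the two proposed allocators, one wrapping a fixed, user-supplied set of points and one producing dummy points from a dim. The class names and the allocate_inducing_points interface are assumptions here, not the eventual implementation:

```python
from typing import Optional

import torch

class FixedAllocator:
    def __init__(self, points: torch.Tensor):
        self.points = points

    def allocate_inducing_points(self, num_inducing: Optional[int] = None) -> torch.Tensor:
        return self.points  # always hand back the fixed points as given


class DummyAllocator:
    def __init__(self, dim: int):
        self.dim = dim

    def allocate_inducing_points(self, num_inducing: int = 100) -> torch.Tensor:
        # Placeholder points; a model would replace these once real data arrives.
        return torch.zeros(num_inducing, self.dim)
```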

--

Not actually sure why that test is failing. Nothing should have changed that. Did adding data to models somehow change? It should be easily higher than the 0.3 threshold (it should be around 0.5, looking at the code).

Wait for @crasanders' opinion on direction before moving on.

@crasanders (Contributor) commented:
> Moving the methods (like get_min()) out of the models and into the strategy class is probably not the correct path. We are trying to better encapsulate what strategy does (that is, run an experiment), whereas these methods are often used for post-hoc analysis.
>
> I think the get_min() methods should be separated out as utility functions (probably going into the utils.py file in models). You noticed that they already call functions from there, so they may as well live there standalone. These utility functions should accept the model as the first argument (following the utility functions they call) as well as bounds. The bounds should either be set explicitly or be inferred from the model's data (if available, with a warning).

I like the idea of making the get_min methods of the Model classes separate functions, although I think the bounds should always be passed explicitly. We do have to keep the methods on the Strategy class for now because of the way server messages are handled, but we could refactor that later.

--

> I'm not convinced that inducing_points should be a separate argument for models. There are multiple reasons:
>
> • As a corollary to wanting to remove ub/lb from models, we also don't want dim there; right now dim seems to be used only for inducing points.
> • This introduces a complex set of rules between inducing_points, dim, and inducing_point_method.
> • The model call signatures are already pretty complex; limiting the bloat is good.
>
> Given that all three arguments appear to exist strictly for inducing points, we should try to encapsulate it all in InducingPointAllocator (or at least reduce it to that one argument). I think the best way to do this is to require an inducing_point_method that is an InducingPointAllocator and have any configuration logic deal with it. This may require adding new InducingPointAllocator classes (one for fixed inducing points, and maybe one for dummy points). This will unify the API and push the complexities of setting up inducing points into a separate class with its own guard rails.

I agree that we should get rid of dim if we can. And after our conversation earlier, I agree that the cleanest solution is to add new InducingPointAllocators: one for fixed inducing points and one for dummy ones. To make code review easier, it might be better to open a new PR that implements those and then rebase this one on top of it.

--

> Not actually sure why that test is failing. Nothing should have changed that. Did adding data to models somehow change? It should be easily higher than the 0.3 threshold (it should be around 0.5, looking at the code).

I'll take a closer look after these other changes have been made.
